Method and system for encoding fractional bitplanes

ABSTRACT

In a layered encoding system having at least one layer comprising a plurality of sub-layers (272, 274, 276), a method is disclosed herein for encoding a video image (200) composed of a plurality of pixel blocks containing at least one area determined to be significant (210, 215, 220) within a corresponding sub-layer (272, 274, 276). The method comprises the steps of: associating a level of significance with each block (250, 252) of a known size within the at least one significant area (210); associating a level of significance with successively larger blocks (222, 244) dependent upon the level of significance of at least one of the blocks (250, 252) of a known size contained within said larger block (222, 244); and mapping each of the associated levels of significance. In another embodiment of the invention, the significance map is transmitted and corresponding image layers may be reconstructed using the significance map.

The present invention relates to video image encoding and more specifically to fractionally encoding enhancement layers of layer encoded video images.

Layer encoding, such as Fine Granular Scalability (FGS), and wavelet encoding are well known in the video image encoding art. FGS encoding, for example, encodes video images into a base layer and an enhancement layer. The base layer represents the minimum image that may be transmitted over a network with an acceptable quality. The enhancement layer represents additional image details that may be transmitted over the network when sufficient residual bandwidth is available.

Enhancement layers are encoded in a bit-plane format wherein the most significant bit of each enhancement layer value is stored in a first bit plane and each succeeding bit of each enhancement layer value is stored in a corresponding bit plane. During transmission of the enhancement layer, the values in each bit plane are successively transmitted until the available bandwidth is occupied.
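By way of a non-limiting illustration, the bit-plane format described above may be sketched as follows. The function names `to_bit_planes` and `from_bit_planes`, the assumption of non-negative integer values, and the sample values are hypothetical and are introduced solely for this example; they do not form part of the disclosed encoder.

```python
def to_bit_planes(values, depth):
    """Split non-negative integers into `depth` bit planes, most significant plane first."""
    planes = []
    for bit in range(depth - 1, -1, -1):
        planes.append([(v >> bit) & 1 for v in values])
    return planes


def from_bit_planes(planes, depth):
    """Rebuild values from the leading planes that were actually received."""
    count = len(planes[0]) if planes else 0
    values = [0] * count
    for index, plane in enumerate(planes):
        shift = depth - 1 - index  # plane 0 carries the most significant bit
        for i, bit in enumerate(plane):
            values[i] |= bit << shift
    return values


# Hypothetical enhancement layer values with a bit-plane depth of three.
residuals = [5, 0, 3, 6]
planes = to_bit_planes(residuals, depth=3)

# If bandwidth allows only the first two planes, the values are approximated.
truncated = from_bit_planes(planes[:2], depth=3)
```

In this sketch, dropping the trailing plane simply zeroes the least significant bit of each value, which mirrors the successive transmission of bit planes until the available bandwidth is occupied.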

A concept of fractional bit planes has been introduced in JPEG-2000 to differentiate the importance of the various bits within a bit plane and improve the efficiency of bit plane coding within a bit plane. This concept does not exist in other layer encoding methods, such as FGS. Hence, there is a need for an encoding method and device wherein areas of the video image that are determined to be significant are identified prior to encoding the enhancement layer.

In the drawings:

FIG. 1 illustrates an FGS fractional bit plane encoder in accordance with the principles of the present invention;

FIG. 2 illustrates a significance mapped enhancement layer bit plane;

FIG. 3a illustrates a flow chart of an exemplary process for identifying significant image areas within an image in accordance with the principles of the invention;

FIG. 3b illustrates a flow chart of an exemplary process for generating a significance map in accordance with the principles of the invention; and

FIG. 4 illustrates a system for determining significance mapped enhancement layer bit planes in accordance with the principles of the invention.

It is to be understood that these drawings are solely for purposes of illustrating the concepts of the invention and are not intended as a definition of the limits of the invention. The embodiments shown in FIGS. 1 through 4 and described in the accompanying detailed description are to be used as illustrative embodiments and should not be construed as the only manner of practicing the invention. Also, the same reference numerals, possibly supplemented with reference characters where appropriate, have been used to identify similar elements.

In a layered encoding system having at least one layer comprising a plurality of sub-layers, a method is disclosed herein for encoding a video image composed of a plurality of pixel blocks containing at least one area determined to be significant within a corresponding sub-layer. The method comprises the steps of associating a level of significance with each block of a known size within the at least one significant area, associating a level of significance with each successively larger block dependent upon the level of significance of at least one of the blocks of a known size contained within the successively larger block, and mapping each of the associated levels of significance.

In another embodiment of the invention, the significance map is transmitted and corresponding image layers may be reconstructed using the significance map.

FIG. 1 illustrates a block diagram of an exemplary fractional bit plane encoder 100 in accordance with the principles of the present invention. In this diagram, input signal 110 is applied to summer 115, where it is combined with motion compensated images, as will be further discussed. The combined signal is then applied to Discrete Cosine Transform (DCT) 120 to convert pixel values into coefficients. The DCT coefficients are next applied to quantizer 125 for quantization. The quantized DCT coefficients are then applied to a Variable Length Coder 130 and combiner 175.

The quantized DCT coefficients are also applied to inverse quantizer 135 to restore the DCT coefficients. As should be understood, the restored DCT coefficients are not exactly the same as the original DCT values, as some information is lost in the quantization process. The inverse quantized coefficients are next applied to inverse DCT 140 to recover the pixel elements after DCT and quantization processing. Similarly, a known difference between the original pixel elements and the restored pixel elements exists because some information is lost in the quantization process. The recovered pixel elements are applied to motion estimator/motion compensator 145. The motion estimated/compensated signal is then applied to summing device 115 to be combined with the original image 110.

The summed image 150 is also applied to summing device 155 along with the recovered pixel elements output from inverse DCT 140. The output of summing device 155 is a residual image between the original signal 110 and the recovered base layer image. The residual image is concurrently applied to enhancement layer encoder 160 and significance map encoder 165. The results of significance map encoder 165 are further applied to enhancement encoder 170 for mapping the bit planes, as will be more fully described.
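As a simplified, non-limiting illustration of why a residual exists, the following sketch quantizes and restores a set of values and takes the difference. It assumes a uniform scalar quantizer with truncation toward zero and omits the DCT, inverse DCT and motion compensation stages entirely; the step size, the sample values and the function names are hypothetical.

```python
def quantize(values, step):
    """Uniform scalar quantization, truncating toward zero (an assumed quantizer)."""
    return [int(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]


def residual(original, restored):
    """Difference between original and restored values, i.e. what the base layer lost."""
    return [o - r for o, r in zip(original, restored)]


# Hypothetical input values and quantizer step.
original = [37, -12, 5, 0]
step = 8

levels = quantize(original, step)        # carried by the base layer
restored = dequantize(levels, step)      # what a base-layer-only decoder recovers
enhancement = residual(original, restored)  # left for the enhancement layer
```

Adding `enhancement` back to `restored` recovers `original` exactly in this sketch, which is the role the enhancement layer plays relative to the base layer.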

The outputs of enhancement layer encoder 170 and significance map encoder 165 are applied to combiner 180 and the combined output is applied to combiner 175. The output 190 of combiner 175 may then be transmitted over a network or stored for subsequent transmission.

FIG. 2a illustrates an image frame 200 containing significant information, such as changes in boundaries, color or texture. Significant image areas 210, 215, 220 may be identified using known methods. Correspondingly, areas that exhibit little or no change in texture may be identified as non-significant. Consequently, little or no information regarding these areas need be transmitted. Accordingly, in one embodiment of the invention, the determination of significant areas may be done by reviewing each pixel element. In a preferred embodiment, the determination of significant areas may be done by reviewing corresponding DCT coefficients.

FIG. 2b illustrates another aspect of the present invention, wherein a significant image area, for example 210, is associated with a plurality of blocks, corresponding macroblocks, and corresponding super-macroblocks. Although a specific segmentation of the image is shown, it will be appreciated that the image may be segmented according to other criteria, as will be discussed below. In this illustrated example, image area 210 is composed of super-macroblocks 222, 224, 226, 228, 230 and 232. Each super-macroblock may be partitioned into macroblocks. For clarity, super-macroblock 222 is shown partitioned into macroblocks 240, 242, 244 and 246. Each macroblock 240, 242, 244 and 246 may be further partitioned into mini-macroblocks. For clarity, macroblock 240 is shown partitioned into mini-macroblocks 250, 252, 254, and 256. Each mini-macroblock may be further partitioned into blocks. For clarity, mini-macroblock 250 is shown partitioned into blocks 260, 262, 264 and 266. As will be appreciated, each super-macroblock may be similarly partitioned into, and associated with, macroblocks, mini-macroblocks, and blocks.

In a preferred embodiment, block 260 contains information associated with an 8×8 configuration of pixel elements. Furthermore, mini-macroblock 250 is associated with a 16×16 configuration of pixel elements, macroblock 240 is associated with a 32×32 configuration of pixel elements and super-macroblock 222 is associated with a 64×64 configuration of pixel elements. In this preferred embodiment, block 260 is analogous to the DCT encoding of a corresponding block of pixel elements.
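For purposes of illustration only, the nesting of the four block sizes of the preferred embodiment may be sketched as follows. The sketch assumes a fixed grid aligned to the image origin; the names `LEVEL_SIZES` and `containing_units` and the sample coordinates are hypothetical.

```python
# Block sizes of the preferred embodiment, in pixel elements per side.
LEVEL_SIZES = {
    "block": 8,
    "mini-macroblock": 16,
    "macroblock": 32,
    "super-macroblock": 64,
}


def containing_units(x, y):
    """Return the (column, row) index of the unit at each level that contains pixel (x, y)."""
    return {name: (x // size, y // size) for name, size in LEVEL_SIZES.items()}


# Hypothetical pixel position within the image.
units = containing_units(40, 20)
```

Because each size is double the previous one, every unit in this sketch contains a 2×2 arrangement of units of the next smaller size, consistent with the four-way partitioning shown in FIG. 2b.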

FIG. 2c illustrates the bit-plane mapping 270 of the identified significant area 210 in bit planes 272, 274, and 276 in accordance with the preferred embodiment of the invention. In this case, the enhancement layer is encoded using three bit planes. However, it should be understood that the depth of the bit planes may be any number and there is no intention to limit the bit-plane depth to that shown herein. In this preferred embodiment, since the DCT information is mapped to each bit plane, area 210 and associated super-macroblocks, macroblocks, mini-macroblocks, and blocks may be readily identified.

FIG. 3a illustrates a flow chart of an exemplary process 300 for significance mapping in accordance with the principles of the invention. In this process, significance mapping is initiated at an arbitrarily selected bit plane associated with the image or picture. In the illustrated preferred embodiment, the bit plane associated with the most significant bits, i.e., bit-plane 0, is selected at block 305. At block 310, a significance map associated with the selected bit plane is determined. At block 315, the significance map associated with the bit plane is coded. At block 320, the texture of the blocks identified as being significant is coded and a bit-wise representation of the significance map is generated. This bit-wise representation can be decoded at the receiving device to recover the significance map. At block 325, a determination is made whether all the bit planes associated with the image have been processed. If the answer is negative, then a next/subsequent bit plane is selected at block 332 and the significance mapping process continues for the selected next/subsequent bit plane.

If, however, the answer is in the affirmative, then a determination is made at block 330 whether all the images have been processed. If the answer is negative, then a next/subsequent image or picture is selected at block 334. The significance mapping process then continues for each bit plane in the selected next/subsequent image or picture.
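The per-bit-plane loop of process 300 may be sketched, in a non-limiting manner, as follows. The sketch assumes a single block size, treats a block as significant when it holds any non-zero bit, and emits raw bits in place of any entropy coding; the function names `encode_bit_plane` and `encode_image` and the sample data are hypothetical.

```python
def encode_bit_plane(plane, block):
    """Return (map_bits, texture) for one bit plane split into block x block tiles.

    map_bits holds one bit per tile in raster order; texture holds the raw bits
    of the significant tiles only (a stand-in for actual texture coding).
    """
    height, width = len(plane), len(plane[0])
    map_bits, texture = [], []
    for top in range(0, height, block):
        for left in range(0, width, block):
            tile = [
                plane[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            significant = any(tile)
            map_bits.append(1 if significant else 0)
            if significant:
                texture.append(tile)
    return map_bits, texture


def encode_image(planes, block):
    """Process every bit plane of one image, starting with bit-plane 0."""
    return [encode_bit_plane(plane, block) for plane in planes]


# Hypothetical 4x4 image with two bit planes and 2x2 blocks.
plane0 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
plane1 = [[0] * 4 for _ in range(4)]
coded = encode_image([plane0, plane1], block=2)
```

In this sketch, an all-zero bit plane costs only its map bits, since no texture is emitted for insignificant blocks. Repeating `encode_image` for each image in a sequence would correspond to the outer loop through blocks 330 and 334.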

FIG. 3b illustrates a flow chart of an exemplary significance mapping process 310. In this exemplary process, an initial block size and associated minimum and maximum block sizes are determined at block 340. In this case, an initial block size associated with the preferred block size is depicted. At block 345, a determination is made whether the current block size is equal to the smallest block size. If the answer is in the affirmative, a determination is made at block 350 whether the current block has any non-zero coefficients. If the answer is in the affirmative, then the associated block is marked or identified as being significant at block 355.

However, if the answer is negative, then the block is marked or identified as being insignificant at block 370.

After identifying the current block as significant, at block 355, or insignificant, at block 370, a determination is made at block 360 whether the last block has been reached. If the answer is negative, then a next/subsequent block in the bit plane is selected at block 365. Processing continues on the selected next/subsequent block at block 345.

If, however, the answer at block 360 is in the affirmative, i.e., all blocks at the current size have been processed, then a determination is made at block 375 whether the current block size is greater than the maximum block size. If the answer is in the negative, then the current block size is increased, preferably doubled, at block 380. Processing continues on each block associated with the increased size at block 345.

Returning to the determination at block 345, if the answer is negative, then a determination is made at block 385 whether smaller blocks, i.e., children within the larger block, are significant. If the answer is affirmative, then the larger block is marked or identified as being significant at block 355. If, however, the answer is in the negative, then the larger block is marked or identified as being insignificant at block 370.

Processing then continues on each of the successively larger blocks until the block size exceeds a maximum block size at block 375.
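By way of example only, the bottom-up marking of process 310 may be sketched as follows. The sketch assumes that a larger block is significant when any one of its children is significant, that the block size is doubled at each step, and that blocks lie on a fixed grid; the function names and the sample bit plane are hypothetical.

```python
def leaf_significance(plane, size):
    """Mark each smallest block significant if it holds any non-zero value."""
    height, width = len(plane), len(plane[0])
    result = []
    for top in range(0, height, size):
        row = []
        for left in range(0, width, size):
            row.append(any(
                plane[y][x] != 0
                for y in range(top, min(top + size, height))
                for x in range(left, min(left + size, width))
            ))
        result.append(row)
    return result


def parent_significance(children):
    """Mark each larger block significant if any of its (up to four) children is."""
    rows, cols = len(children), len(children[0])
    result = []
    for r in range(0, rows, 2):
        row = []
        for c in range(0, cols, 2):
            row.append(any(
                children[cr][cc]
                for cr in range(r, min(r + 2, rows))
                for cc in range(c, min(c + 2, cols))
            ))
        result.append(row)
    return result


def significance_map(plane, min_size=8, max_size=64):
    """Build one significance grid per block size, from min_size up to max_size."""
    levels = {min_size: leaf_significance(plane, min_size)}
    size = min_size
    while size * 2 <= max_size:
        levels[size * 2] = parent_significance(levels[size])
        size *= 2
    return levels


# Hypothetical 8x8 bit plane with one non-zero entry; small sizes keep the example short.
plane = [[0] * 8 for _ in range(8)]
plane[5][1] = 1
levels = significance_map(plane, min_size=2, max_size=8)
```

Only the smallest blocks are examined for non-zero values in this sketch; every larger size is derived from the grid below it, so each level costs a quarter of the work of the previous one.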

FIG. 4 illustrates an exemplary embodiment of a system 400 that may be used for implementing the principles of the present invention. System 400 may represent a TV transmitter or receiving system, a desktop, laptop or palmtop computer, a personal digital assistant (PDA), a video/image storage apparatus such as a video cassette recorder (VCR), a digital video recorder (DVR), a TiVO apparatus, etc., as well as portions or combinations of these and other devices. System 400 may contain one or more input/output devices 402, processors 403, and memories 404, which may access one or more sources 401 that contain video images. Sources 401 may be stored in permanent or semi-permanent media such as a television receiver (SDTV or HDTV), a VCR, RAM, ROM, hard disk drive, optical disk drive or other video image storage devices. Sources 401 may alternatively be accessed over one or more network connections 410 for receiving video from a server or servers over, for example, a global computer communications network such as the Internet, a wide area network, a metropolitan area network, a local area network, a terrestrial broadcast system, a cable network, a satellite network, a wireless network, or a telephone network, as well as portions or combinations of these and other types of networks.

Input/output devices 402, processors 403, and memories 404 may communicate over a communication medium 406. Communication medium 406 may represent, for example, a bus, a communication network, one or more internal connections of a circuit, circuit card or other apparatus, as well as portions and combinations of these and other communication media. Input data from the sources 401 is processed in accordance with one or more software programs that may be stored in memories 404 and executed by processors 403 in order to supply fractionally encoded video images to network 420. The fractionally encoded video images may be transmitted to a storage device, or may be transmitted to a display system for real-time viewing of the encoded video image.

Processors 403 may be any means, such as a general purpose or special purpose computing system, or may be a hardware configuration, such as a laptop computer, desktop computer, handheld computer, dedicated logic circuit, integrated circuit, Programmable Array Logic (PAL), Application Specific Integrated Circuit (ASIC), etc., that provides a known output in response to known inputs.

In a preferred embodiment, the coding and decoding employing the principles of the present invention may be implemented by computer readable code executed by processor 403. The code may be stored in the memory 404 or read/downloaded from a memory medium such as a CD-ROM or floppy disk (not shown). In other embodiments, hardware circuitry may be used in place of, or in combination with, software instructions to implement the invention. For example, the elements illustrated herein may also be implemented as discrete hardware elements.

In one aspect of the invention, the term processor may represent one or more processing units or computing units in communication with one or more memory units and other devices, e.g., peripherals, connected electronically to and communicating with the at least one processing unit. Furthermore, the devices may be electronically connected to the one or more processing units via internal busses, e.g., ISA bus, microchannel bus, PCI bus, PCMCIA bus, etc., or one or more internal connections of a circuit, circuit card or other device, as well as portions and combinations of these and other communication media, or an external network, e.g., the Internet or an intranet.

Fundamental novel features of the present invention have been shown, described, and pointed out as applied to preferred embodiments. It should be understood that various omissions and substitutions and changes in the apparatus described, in the form and details of the devices disclosed, and in their operation, may be made by those skilled in the art without departing from the spirit of the present invention. For example, although the present invention has been described with regard to FGS encoding, it should be understood that the present invention would also be suitable for similarly developed layer encoding systems. Similarly, while super-macroblocks are discussed with regard to 64×64 arrays or matrices, it should be within the knowledge of those skilled in the art to vary the block size. Furthermore, while the boundaries of the super-macroblocks are shown fixed, it is contemplated that the super-macroblock boundaries may be dynamically determined based on the first indication of significant data.

It is also expressly intended that all combinations of those elements which perform substantially the same function in substantially the same way to achieve the same result are within the scope of the invention. Substitutions of elements from one described embodiment to another are also fully intended and contemplated.

1. In a layered encoding system having at least one layer comprising a plurality of sub-layers, a method for encoding a video image (200), composed of a plurality of pixel blocks, containing at least one area determined to be significant (210) within a corresponding sub-layer (272, 274, 276), said method comprising the steps of: a. associating a level of significance with each block of a known size (250, 252) within said at least one significant area (210); b. associating a level of significance with each of at least one successively larger blocks (222, 244) dependent upon said level of significance of at least one of said blocks (250, 252) of a known size contained within said successively larger block (222, 244); and c. mapping each of said associated levels of significance.
2. The method as recited in claim 1, further comprising the step of: repeating steps a-c for each of said sub-layers.
3. The method as recited in claim 1, further comprising the step of: transmitting said significance level mapping corresponding to said sub-layer.
4. The method as recited in claim 1, wherein said layer encoding system is a Fine Granular Scalable (FGS) System.
5. The method as recited in claim 4, wherein said sub-layer is a bit-plane (272, 274, 276).
6. The method as recited in claim 1, wherein said block size is selected from a predetermined set of sizes.
7. The method as recited in claim 1, wherein said successively larger block has a known maximum value.
8. A system (400) for encoding (100) a video image (200) formed as a plurality of pixel blocks into at least one layer wherein one of said layers is composed of a plurality of sub-layers (272, 274, 276), said sub-layer including at least one significant area (210), comprising: means (165) for associating a level of significance with each block of a known size (250, 252) within said at least one significant area (210); means (165) for identifying a level of significance with each of at least one successively larger block (222, 244) dependent upon said level of significance of at least one of said blocks (250, 252) of a known size contained within said successively larger block (222, 244); and means (165) for mapping said level of significance.
9. The system as recited in claim 8, wherein said mapping includes information regarding each of said blocks of known size and successive blocks having a known level.
10. The system as recited in claim 8, wherein said known level is representative of a non-zero coefficient.
11. A decoding system for decoding images transmitted as a layer encoded signal, comprising: means for receiving data corresponding to a significance mapping of at least one sub-layer of said layered encoding signal; means for decoding said significance map; and means for reconstructing a corresponding one for said sub-layers from said significance map.
12. The decoding system as recited in claim 11, further comprising: means for receiving said layer encoded signal transmitted over a network.
13. The decoding system as recited in claim 11, wherein said significance map includes information regarding blocks containing significant information.